Softala is Haaga-Helia’s ICT living lab. The synthetic sensor project described here was part of IoT Rapid-Proto Labs, an Erasmus+ funded programme creating new IoT talent in Europe. In the programme, students from different European universities have the opportunity to contribute to real-life business cases with IoT.
The main focus of the synthetic sensor project was to teach an AI to recognize whether an office space or a classroom is dark or lit by sunlight or lamps. Machine learning, specifically the support vector machine (SVM) algorithm, was used to classify this specific setup.
Devices and Architecture
The device in use, a Raspberry Pi, was connected to a multipurpose sensor board, the Matrix Creator. The Matrix Creator measures several quantities with its on-board sensors: a microphone, a 3D accelerometer, temperature and humidity sensors, and a UV sensor. The Raspberry Pi runs Linux, has WiFi, uses a Micro SD card for local storage, and is easy to plug other hardware into. It is a versatile, open platform with plenty of reference material available online. Despite the local storage, in this case the measurement data is sent to the cloud.
The Matrix Creator, connected to the Raspberry Pi, works as a shield that provides a standard connection structure. The Raspberry Pi sends the data over the internet to a cloud service for analysis, because machine learning takes too much computing power and storage for the Raspberry Pi to cope with.
Machine learning algorithms of this kind generally need manually labelled training data, which in this case consists of UV sensor values paired with how dark or light the room is. The training set does not need to be large: in this case, 15 rows of sample values are enough for the SVM. Test data is used to verify how well the SVM works, and it must be kept separate from the training data.
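As a rough illustration of labelled training data and a separate test set, the following minimal sketch trains an SVM on UV readings. It assumes that the Scikit library refers to scikit-learn. The 15 UV values, their labels, the linear kernel and the 10/5 split are made up for the example and are not the project's actual data or settings.

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical UV readings (arbitrary units) with manual labels.
# These 15 rows are invented for illustration, not measured data.
uv_values = [0.01, 0.02, 0.03, 0.02, 0.04, 0.05, 0.03,
             0.60, 0.72, 0.85, 0.66, 0.91, 0.78, 0.69, 0.88]
labels = ["dark"] * 7 + ["light"] * 8

# scikit-learn expects one row per sample and one column per feature.
X = [[value] for value in uv_values]

# Hold out 5 rows for testing so the model is never evaluated
# on rows it has already seen during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=5, stratify=labels, random_state=0
)

# Linear kernel chosen as an assumption; the source does not name one.
model = SVC(kernel="linear")
model.fit(X_train, y_train)

# Fraction of held-out rows classified correctly.
accuracy = model.score(X_test, y_test)
print(f"train rows: {len(X_train)}, test rows: {len(X_test)}, accuracy: {accuracy:.2f}")
```

The stratified split keeps both classes present in the training and test portions, which matters with so few rows.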
Analysis Tools and Algorithms
The Scikit library for Python (scikit-learn) and Weka were the two major tools used for analysis. Weka has a graphical user interface, whereas Scikit is purely a programming library. The graphical user interface can be beneficial when quickly testing different algorithms and evaluating which one to use.
The analysis varies considerably depending on how many variables affect the situation. If, for example, IR radiation were added to the light measurement, the analysis would be two-dimensional.
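To show what a two-dimensional analysis means in practice, here is a small sketch in which each sample carries two features, UV and IR, instead of one. It again assumes scikit-learn. The readings, the labels and the linear kernel are hypothetical.

```python
from sklearn.svm import SVC

# Hypothetical (UV, IR) pairs in arbitrary units; invented for illustration.
samples = [
    [0.02, 0.05], [0.03, 0.10], [0.01, 0.08],   # dark room
    [0.70, 0.90], [0.85, 0.75], [0.65, 0.80],   # lit room
]
labels = ["dark", "dark", "dark", "light", "light", "light"]

model = SVC(kernel="linear")
model.fit(samples, labels)

# With a linear kernel the model learns one weight per feature,
# so the weight matrix has two columns for the two measurements.
weights_shape = model.coef_.shape

# Classify a new reading that has both a UV and an IR value.
prediction = model.predict([[0.75, 0.85]])[0]
print(weights_shape, prediction)
```

Adding a variable only adds a column to each sample row, but the classifier then separates the classes with a line in the UV/IR plane rather than a single threshold on UV.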
The result of the analysis is a confusion matrix. Developing a machine learning model means facing false positives (abbreviated fp) and false negatives (abbreviated fn), that is, cases where the model classifies a sample incorrectly. This is inherent to the process, so a certain percentage of false negatives and false positives always occurs.
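The four cells of a two-class confusion matrix can be counted by hand, as in the sketch below. The helper name confusion_counts, the choice of "light" as the positive class and the example labels are all assumptions made for illustration.

```python
def confusion_counts(actual, predicted, positive="light"):
    """Count true/false positives and negatives for a two-class problem."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, guess in zip(actual, predicted):
        if guess == positive and truth == positive:
            counts["tp"] += 1   # predicted light, room was light
        elif guess == positive:
            counts["fp"] += 1   # predicted light, room was dark
        elif truth == positive:
            counts["fn"] += 1   # predicted dark, room was light
        else:
            counts["tn"] += 1   # predicted dark, room was dark
    return counts


# Hypothetical true labels and model predictions for six test rows.
actual = ["light", "light", "dark", "dark", "light", "dark"]
predicted = ["light", "dark", "dark", "light", "light", "dark"]

counts = confusion_counts(actual, predicted)
print(counts)  # {'tp': 2, 'fp': 1, 'fn': 1, 'tn': 2}
```

In this made-up example one dark room is reported as light (fp) and one lit room is reported as dark (fn), so four of the six rows are classified correctly.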
The lightness level of a room depends on the amount of daylight or light from other sources. In this project it was possible to teach the machine learning algorithm to recognize, by assessing UV light, whether the room is lit or not. The tests of streaming the data in real time failed, but it is possible to process a dataset and judge whether a given value means light or dark conditions.
At the moment, the software has to be started and the data collected manually. To develop the system for business use, it is essential that the analysis of the measurement data coming from the environment be automated. The possibilities for applying sensors that observe the environment are almost limitless.