Analytics Workshop Sofia2. Notebook. (Part 2/4)

The goal of this workshop is to create a recommendation system based on user ratings. The workshop is based on one of the exercises proposed at the Spark Summit.

We’ll use one of the MovieLens datasets already stored on the platform. We’ll do it in four steps:

  • Ingestion and data preparation using Pipelines.
  • Creating the model using a Notebook.
  • Ontology Generation.
  • Creating a simple display.

With the help of Sofia2 notebooks, we are going to build the movie recommendation model from the data we uploaded to the platform in the previous exercise. We propose doing it with Spark using Scala; more specifically, we’ll use the ALS (alternating least squares) algorithm.

Input data paths Definition

The first step is to read the movie data and the ratings, and to do this you have to define the data paths. Define the variables ratings_path and movies_path and set them to the locations where you loaded the data onto the platform. For example:

Downloaded data paths Definition

image340

Tip: If you do this workshop at Sofia2.com/console, replace ‘sofia2-analytic:8020’ with ‘localhost:8020’.

Structure the data

The next thing to do is to load the movie information and the ratings. We’ll read this information into Spark RDDs.

You need to define a specific format for each one: movies as (movieId, movieName) and ratings as (timestamp % 10, Rating(userId, movieId, rating)).

We also take the opportunity to import the MLlib classes used in the example. In particular, you need ALS, Rating and MatrixFactorizationModel.

Save data in RDD

image341
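As a rough illustration of the two record formats described above, here is a minimal sketch of the parsing step. It assumes the MovieLens files use `::` as the field separator (`movieId::title::genres` and `userId::movieId::rating::timestamp`), which the text does not state. The `Rating` case class is a local stand-in for MLlib's `Rating` so that the sketch runs without Spark; `ParseSketch`, `parseMovie` and `parseRating` are hypothetical names.

```scala
object ParseSketch {
  // Local stand-in for org.apache.spark.mllib.recommendation.Rating,
  // so the sketch runs without Spark on the classpath.
  final case class Rating(user: Int, product: Int, rating: Double)

  // Assumed input line: "movieId::title::genres" -> (movieId, movieName)
  def parseMovie(line: String): (Int, String) = {
    val fields = line.split("::")
    (fields(0).toInt, fields(1))
  }

  // Assumed input line: "userId::movieId::rating::timestamp"
  // -> (timestamp % 10, Rating(userId, movieId, rating))
  def parseRating(line: String): (Long, Rating) = {
    val fields = line.split("::")
    val key = fields(3).toLong % 10 // last digit of the timestamp
    (key, Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble))
  }
}
```

In the notebook, functions like these would typically be mapped over the lines read with `sc.textFile(ratings_path)` and `sc.textFile(movies_path)`, using MLlib's own `Rating` instead of the stand-in.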

Data checks

Now, check that the data has been read correctly. How many ratings did you load? How many films are in the catalog? How many movies have been rated, and by how many users?

image342

Split Dataset

Before building the model, the dataset must be split into three parts: one for training (60%), one for validation (20%) and one for testing (20%).

image343
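One way to obtain a 60/20/20 split is to reuse the `timestamp % 10` key attached to each rating. The sketch below assumes that approach (the text does not say how the split is made) and works on a plain Scala `Seq` so that it runs without Spark; `SplitSketch` and `split` are hypothetical names.

```scala
object SplitSketch {
  // Splits keyed records into (training, validation, test) by key = timestamp % 10:
  // keys 0-5 -> training (~60%), keys 6-7 -> validation (~20%), keys 8-9 -> test (~20%).
  def split[A](keyed: Seq[(Long, A)]): (Seq[A], Seq[A], Seq[A]) = {
    val training   = keyed.collect { case (k, v) if k < 6 => v }
    val validation = keyed.collect { case (k, v) if k >= 6 && k < 8 => v }
    val test       = keyed.collect { case (k, v) if k >= 8 => v }
    (training, validation, test)
  }
}
```

The percentages are only approximate, since they depend on how evenly the last digit of the timestamp is distributed. On an RDD the same three predicates would be applied with `filter`.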

Function to evaluate the model

Once the data is split, define the function that will evaluate the performance of the model. In particular, we will use the Root Mean Squared Error (RMSE), and this is the Scala version:

image344
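For reference, RMSE is the square root of the mean of the squared differences between predicted and actual ratings. The sketch below is a Spark-free illustration of that formula, not the notebook's function; `RmseSketch` and `rmse` are hypothetical names.

```scala
object RmseSketch {
  // RMSE over (predicted, actual) pairs: sqrt(mean((predicted - actual)^2)).
  def rmse(pairs: Seq[(Double, Double)]): Double = {
    require(pairs.nonEmpty, "need at least one (predicted, actual) pair")
    val squaredErrors = pairs.map { case (p, a) => (p - a) * (p - a) }
    math.sqrt(squaredErrors.sum / pairs.size)
  }
}
```

For example, the pairs (3, 4), (5, 5) and (1, 2) have squared errors 1, 0 and 1, so the RMSE is sqrt(2/3), roughly 0.816. In the notebook, the pairs would presumably come from joining the model's predictions with the actual ratings on (userId, movieId).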

Choice of model

Now you can use this function to choose the parameters for the training algorithm. The ALS algorithm requires three parameters: the rank of the factor matrices, the number of iterations and a regularization parameter, lambda. We will define several values for each parameter and try different combinations of them to determine which one gives the best model:

image345
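The parameter search described above amounts to a small grid search: score every (rank, lambda, iterations) combination and keep the one with the lowest validation RMSE. Here is a minimal sketch of that loop. The scoring function is passed in as a placeholder, because training needs Spark; in the notebook it would train with MLlib's `ALS.train(training, rank, numIter, lambda)` and evaluate on the validation set. `GridSearchSketch`, `Params` and `bestParams` are hypothetical names.

```scala
object GridSearchSketch {
  final case class Params(rank: Int, lambda: Double, numIter: Int)

  // Tries every combination and returns the one with the lowest score.
  // `validationRmse` is a placeholder: in the notebook it would train ALS with
  // these parameters and return the RMSE measured on the validation set.
  def bestParams(
      ranks: Seq[Int],
      lambdas: Seq[Double],
      numIters: Seq[Int]
  )(validationRmse: Params => Double): (Params, Double) = {
    val scored = for {
      rank    <- ranks
      lambda  <- lambdas
      numIter <- numIters
    } yield {
      val p = Params(rank, lambda, numIter)
      (p, validationRmse(p))
    }
    scored.minBy(_._2) // lowest validation error wins
  }
}
```

Choosing on the validation set and reporting on the test set keeps the final error estimate independent of the parameter search.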

Which one do you think is the best model?
Now let’s run our function on the test data.

image346

User recommendations Performance

Once the best model is chosen, the next step is to obtain the movie recommendations for each user. The idea is to prompt for a user ID, which in this dataset is numeric. We’ll do it with a form: first prompt for the user, enter the ID in a text field and finally display the recommendations. To prompt for the user:

image347

For this example, we configured it to show the 10 best recommendations for the user inserted in the text field.

image348
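Selecting the 10 best recommendations comes down to sorting the predicted ratings for that user and keeping the highest ones. The sketch below shows that step on plain Scala collections. Skipping movies the user has already rated is an assumption, not something the text specifies, and `TopNSketch` and `topN` are hypothetical names.

```scala
object TopNSketch {
  // Returns the n movie IDs with the highest predicted rating,
  // skipping movies the user has already rated (an assumed design choice).
  def topN(predicted: Map[Int, Double], alreadyRated: Set[Int], n: Int): Seq[Int] =
    predicted.toSeq
      .filterNot { case (movieId, _) => alreadyRated.contains(movieId) }
      .sortBy { case (_, score) => -score } // highest score first
      .take(n)
      .map(_._1)
}
```

With a trained MLlib model, `MatrixFactorizationModel.recommendProducts(user, num)` provides a comparable top-N list directly.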

Persist the recommendations

Now we only have to save the best recommendations for each user in an ontology. The idea is to save records of the form: UserId, MovieName, MovieGenre.

image349

Create the Hive table from the data stored in the DataFrame. Replace the table name shown in the image, “recomendaciones_arturo”, with a unique identifier, for example recomendaciones_yourname.

image350


Sofia2 API for Rephone Kit

This post is a step-by-step guide to POSTing a message to the Sofia2 Smart Platform through its REST API, using the versatile Xadow BLE+GSM module from Seeed. This module is the central unit of the Rephone Kit, a modular approach designed to build innovative IoT scenarios.

Xadow BLE+GSM is a tiny development board and a great fit for mobility applications. Based on one of the smallest chips in the market, Xadow BLE+GSM provides developers with an enriching blend of communication technologies, thanks to Bluetooth Smart (BLE) capabilities paired with a 2G GSM/GPRS modem (850/900/1800/1900MHz).

post1

The board is based on the MT2502 SoC from Mediatek, featuring:

  • Micro-controller: 32-bit ARM7EJ-S™ RISC processor
  • RAM Memory: 4 MB
  • FLASH Memory: 16 MB
  • Power supply: 3.3~4.2V (no SIM) / 3.5~4.2V (with SIM)
  • Power consumption: 20mW (@standby, no radio), 30 mW (@standby GSM), 45 mW (@standby BLE)
  • 4-band modem: (850/900/1800/1900 MHz)
  • GPRS modem class 12
  • Clock: 260 MHz
  • Connectors: 35-pin connector and PIN connector to interconnect extra Rephone modules (GPS board, GPIO expansion board, LCD, …); JST 1.0 battery connector
  • Interfaces: LCD, Audio, I2C, SPI, UART and GPIOs

It is worth noting the wide choice of programming languages supported by Xadow BLE+GSM, with SDKs for C/C++ (using Eclipse), Lua, Arduino and JavaScript. Our recommendation is to use the Eclipse option with C/C++, since it provides the greatest API flexibility (the other choices may lack some libraries).



IoT Devices on Sofia2. Integration and Management (II. User and Ontology creation)

This is the second post of the series “IoT Devices on Sofia2. Integration and Management”. This post covers the actions required to start building the demo scenario presented in the first part:

Part I. Overview

sensortag-sofia2-connection

Essentially, you will create a new user on the platform and upgrade it to the appropriate level to take full advantage of the Sofia2 features highlighted in this tutorial. In addition, we will propose a model for the ontology that will store the data obtained from the SensorTag (http://www.ti.com/sensortag).


Analytics Workshop Sofia2. Data Ingestion. (Part 1/4)

The goal of this workshop is to create a recommendation system based on user ratings. The workshop is based on one of the exercises proposed at the Spark Summit.

We’ll use one of the Movielens datasets that already reside on the platform. We’ll do it in four steps:

  • Ingestion and data preparation using Pipelines.
  • Creating the model using a Notebook.
  • Ontology Generation.
  • Creating a simple display.


Remote deployment of Node-RED flows with Sofia2

As we have mentioned in other posts, Node-RED is a visual tool that provides a lightweight flow engine able to run on resource-constrained devices such as a Raspberry Pi.

In this post we show how Node-RED flows can be edited centrally in the Sofia2 administration console and then deployed remotely to devices. As the example device, and for the demo video, we’ll use a Raspberry Pi Model A.

The remote deployment of flows is carried out on the device by a fairly lightweight ThinKP, whose job is to receive, over Sofia2’s SSAP messaging protocol, new flows edited on the platform and deploy them to the device’s local Node-RED instance.

image001
