The goal of this workshop is to create a recommendation system based on user ratings. The workshop is based on one of the exercises proposed at the Spark Summit.
We’ll use one of the MovieLens datasets that already reside on the platform. We’ll do it in four steps:
- Ingestion and data preparation using Pipelines. Covered in Analytics Workshop Sofia2. Data Ingestion. (Part 1/4)
- Creating the model using a Notebook. Covered in Analytics Workshop Sofia2. Notebook. (Part 2/4)
- Ontology Generation.
- Creating a simple display.
Today we are going to generate an ontology from the Hive table we created in the previous step. To do this, open the ‘Analytics’ menu and select ‘UTIL HIVE_To_Ontology’. You will see a window with a list of available tables. The table you just created will not appear there yet, because it is a Hive table and the list shows Impala entities, so you first have to make the table visible. To do this, click the “Visualize Hive table” button:
You’ll see another window, where our table should appear. Select it and click on ‘Regenerar Metadatos’ (Regenerate Metadata):
Once it has run, return to the previous window with the “Cancel” button. Our table now appears in the list:
Once the table is chosen, click on “Generate Schema” and finally click on “Create”.
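To give an idea of what a generated schema might look like, here is a minimal sketch that turns a list of Hive columns into a JSON Schema document. It assumes the ontology is described by a JSON Schema; the type mapping, the function name `hive_columns_to_schema`, the ontology title and the column names are all hypothetical, and the schema that “Generate Schema” actually produces may be structured differently.

```python
import json

# Assumed mapping from common Hive column types to JSON Schema types.
# Types not listed here fall back to "string" in this sketch.
HIVE_TO_JSON_TYPE = {
    "tinyint": "integer",
    "smallint": "integer",
    "int": "integer",
    "bigint": "integer",
    "float": "number",
    "double": "number",
    "string": "string",
    "boolean": "boolean",
}


def hive_columns_to_schema(title, columns):
    """Build a JSON Schema dict from (column_name, hive_type) pairs."""
    properties = {}
    for name, hive_type in columns:
        json_type = HIVE_TO_JSON_TYPE.get(hive_type.lower(), "string")
        properties[name] = {"type": json_type}
    return {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "title": title,
        "type": "object",
        "properties": properties,
        "required": [name for name, _ in columns],
    }


# Hypothetical columns; use the ones from your own Hive table.
example_columns = [("userid", "int"), ("movieid", "int"), ("rating", "double")]
print(json.dumps(hive_columns_to_schema("recommendations_demo", example_columns), indent=2))
```

Comparing this kind of output with the schema shown in the platform can help you spot columns whose types were not mapped as you expected.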
A window appears with the data of the created ontology. Only one step remains: activating the ontology. Click the “Modify” button at the bottom of the page. Another window will open; check the “Active” box:
Finally, generate the instance and click the “Save” button. To work with the ontology, we have to associate a valid ThinKP with it. If you have already created one, you can associate it with this ontology in “My ThinKPs” -> Edit (after choosing the ThinKP) by adding the ontology to the list associated with the ThinKP. For this workshop, we will create a new one.
Click on ‘THINKPS SOFIA2’ -> “My ThinKPs” and click on ‘New ThinKP’:
You will see a new window in which you have to fill in “Identification” with the name of the new ThinKP, and choose the ontologies to which the ThinKP will have access.
To mark more than one ontology, use Ctrl + Shift.
Once you have filled in the data, click on “New” and a summary window of the ThinKP will appear:
Now the ontology is ready. Go to the ‘TOOLS’ menu and run a query on the newly created ontology.
It’s good practice to restrict query results in the Sofia2 console with “limit number_registers” (e.g. select * from ontology limit 5).
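If you build queries in a script before pasting them into the console, a small helper can apply this habit for you. The sketch below is plain string handling and does not call any Sofia2 API. The function name `with_limit` and the default of 5 rows are assumptions for illustration.

```python
import re

# Matches a trailing "limit N" clause, ignoring case.
_LIMIT_RE = re.compile(r"\blimit\s+\d+\s*$", re.IGNORECASE)


def with_limit(query, max_rows=5):
    """Return the query with a trailing 'limit N' unless it already has one."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # Drop surrounding whitespace and a trailing semicolon, if any.
    stripped = query.strip().rstrip(";").rstrip()
    if _LIMIT_RE.search(stripped):
        return stripped
    return f"{stripped} limit {max_rows}"


print(with_limit("select * from ontology"))
```

A query that already ends with its own limit is returned unchanged, so the helper never overrides a limit you chose on purpose.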
That’s all for Ontology Generation. In the next and last post we’ll see how to create a simple display with Gadgets and a Dashboard.
See you there.