Data Science is getting very popular and many people are trying to jump on the bandwagon, and this is GREAT. But many assume that data science, machine learning, or plug in any other buzzword here, is just feeding data into some Scikit-Learn library. Here is what the actual job is.

To give you some context, the following happens after the data has been collected. Don’t get me wrong, I don’t think data collection should be considered a simple step, but here I would like to focus on data pre-processing and normalization.
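As a rough illustration only (this is not the pipeline from the post), here is a minimal scikit-learn sketch of what a normalization step can look like; the feature names and values are made up for the example:

```python
# Minimal sketch: standardizing numeric features before modeling.
# Column names and values below are hypothetical.
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical raw data with features on very different scales
df = pd.DataFrame({
    "latency_ms": [12.0, 250.0, 33.0, 8.0],
    "throughput_mbps": [940.0, 120.0, 610.0, 980.0],
})

scaler = StandardScaler()
# fit_transform centers each column to mean 0 and scales it to unit variance
normalized = scaler.fit_transform(df)
print(normalized)
```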

Continue Reading "This is what I really do as a Data Scientist"

A more efficient loss function for Siamese networks

At work, we are using Siamese Neural Nets (NN) for one-shot learning on telecom data. Our goal is to create a NN that can easily detect failures in telecom operators’ networks. To do so, we build an N-dimensional encoding that describes the current status of the network. With this encoding we can then evaluate the status of the network and detect faults. This encoding has the same goal as something like a word embedding (Word2Vec or others). To train this encoding we use a Siamese network [Koch et al.] to create a one-shot encoding, so it works on any network. A simplified description of Siamese networks is available here. For more details about our experiment you can read the blog of my colleague, who is the mastermind behind this idea.
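For illustration, here is a minimal sketch of the standard triplet loss that Siamese-style encodings are commonly trained with. The NumPy implementation, the margin value, and the toy encodings are assumptions for this example; it is not the “lossless” variant described in the full post:

```python
# Minimal sketch of the standard triplet loss (not the post's lossless variant).
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull the anchor toward the positive encoding and push it away
    from the negative encoding by at least `margin`."""
    pos_dist = np.sum(np.square(anchor - positive), axis=-1)
    neg_dist = np.sum(np.square(anchor - negative), axis=-1)
    return np.maximum(pos_dist - neg_dist + margin, 0.0)

# Toy N-dimensional encodings for three network states (hypothetical values)
anchor = np.array([[0.1, 0.9, 0.3]])
positive = np.array([[0.2, 0.8, 0.4]])   # same kind of state as the anchor
negative = np.array([[0.9, 0.1, 0.7]])   # a different (e.g. faulty) state
print(triplet_loss(anchor, positive, negative))
```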
Continue Reading "Lossless Triplet loss"