Advanced Analytics and Deep Learning Models. Group of authors


works by extracting the client's perception of that domain in order to recommend the items that best satisfy their requirements. Its core strength is that it does not need any previous ratings, so it can overcome the cold start problem. Its disadvantage is that it needs experienced engineering, with all its attendant difficulties, to understand the item domain satisfactorily [1].

Another approach used in recommender systems is the hybrid approach, which was designed to overcome the limitations of both the collaborative and the content-based filtering approaches. It combines their strengths by merging the implementations of multiple recommendation algorithms into a single recommendation system, with the aim of improving efficiency and, in turn, performance. A hybrid recommender is built by combining two or more algorithms, and two major points must be taken care of. The first is keeping an account of the recommendation models, declaring the inputs they require, and determining the purpose of the hybrid recommender system. The second is determining the strategy that will be used within the hybrid recommender. The hybrid approach also has certain demerits. It is not cost-effective: because it is an amalgamation of other filtering methods, it is very expensive to implement. Moreover, it increases complexity and sometimes needs outside data, which is unavailable most of the time [1, 18].
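As an illustration of the second point, the following minimal sketch shows one possible combination strategy, a weighted average of the scores produced by two recommenders. The weighted strategy, the function names, and the constant scores are assumptions chosen for illustration; the text does not prescribe a particular strategy.

```python
from typing import Callable, Dict

# A scorer maps (user_id, item_id) to a predicted preference score.
Scorer = Callable[[str, str], float]


def weighted_hybrid(scorers: Dict[str, Scorer], weights: Dict[str, float]) -> Scorer:
    """Combine several recommenders by a weighted average of their scores."""
    total = sum(weights[name] for name in scorers)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    def score(user_id: str, item_id: str) -> float:
        weighted = sum(weights[name] * scorers[name](user_id, item_id) for name in scorers)
        return weighted / total

    return score


# Hypothetical stand-ins for a collaborative and a content-based model.
def collaborative_score(user_id: str, item_id: str) -> float:
    return 4.0


def content_score(user_id: str, item_id: str) -> float:
    return 3.0


hybrid = weighted_hybrid(
    {"cf": collaborative_score, "content": content_score},
    {"cf": 0.7, "content": 0.3},
)
hybrid_example = hybrid("u1", "i1")
print(round(hybrid_example, 2))
```

In this sketch, each component model only has to expose a scoring function, so the required inputs of every model stay separate from the combination strategy.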

3.4 Comparison Among Different Methods

We now compare some of the methods used by researchers around the globe and look at the results of their research.

3.4.1 MCRS Exploiting Aspect-Based Sentiment Analysis

In this work, Musto et al. proposed a CF technique based on MCRS that exploits the information conveyed by users' reviews to analyze users' interests.

In their experimental evaluation, they compared the approach against many traditional models. The outcomes supported the intuition behind this research [5].

Their experimental evaluation used three datasets: Yelp, TripAdvisor, and Amazon.

Table 3.1 Dataset statistics.

	Yelp	TripAdvisor	Amazon
Users	45,981	536,952	826,773
Items	11,573	3,945	50,210
Ratings/Reviews	229,906	796,958	1,324,759
Sparsity	99.95%	99.96%	99.99%
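The sparsity row can be related to the other rows with a short calculation. The sketch below assumes the usual definition, sparsity = 1 - ratings / (users × items), which the text does not state explicitly; depending on rounding or truncation, the last reported digit may differ slightly from the table.

```python
def sparsity(n_users: int, n_items: int, n_ratings: int) -> float:
    """Fraction of the user-item matrix that has no rating (assumed definition)."""
    return 1.0 - n_ratings / (n_users * n_items)


# (users, items, ratings/reviews) taken from Table 3.1.
datasets = {
    "Yelp": (45_981, 11_573, 229_906),
    "TripAdvisor": (536_952, 3_945, 796_958),
    "Amazon": (826_773, 50_210, 1_324_759),
}

for name, (users, items, ratings) in datasets.items():
    print(f"{name}: {sparsity(users, items, ratings):.4%}")
```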

The framework is mainly for aspect extraction and sentiment analysis, and implementing it requires several parameters. In the first step, stop words such as “a”, “and”, “but”, “how”, “or”, and “what” are removed. In the next step, the number of aspects and sub-aspects to extract is set between 10 and 50. To evaluate the effectiveness of sub-aspects, only the main aspects were extracted in some experimental sessions. For sentiment analysis, both the “CoreNLP” and the “AFINN-based” algorithms were used. The KL-divergence score value was set to 0.1. Both user-based and item-based CF were used. To calculate the neighborhood, they used an extended version of Euclidean distance, which they introduced as multi-dimensional Euclidean distance. For all datasets, the neighborhood size was set to 10, 30, and 80; larger values were not used because bigger neighborhoods reduce the efficiency of the proposed algorithm [5].
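The exact formula for the multi-dimensional Euclidean distance is not given here, so the following is only a sketch under stated assumptions: each rating is a vector of per-aspect scores, the distance between two users on one item is the Euclidean distance between their vectors, and the user-to-user distance is the average over co-rated items. The data layout and the function names are hypothetical.

```python
import math
from typing import Dict, List, Optional

# ratings[user][item] = per-aspect scores (hypothetical layout).
Ratings = Dict[str, Dict[str, List[float]]]


def multi_dim_distance(ratings: Ratings, u: str, v: str) -> Optional[float]:
    """Average Euclidean distance between aspect vectors on co-rated items."""
    common = ratings[u].keys() & ratings[v].keys()
    if not common:
        return None  # no co-rated items, so the distance is undefined
    per_item = [math.dist(ratings[u][i], ratings[v][i]) for i in common]
    return sum(per_item) / len(per_item)


def nearest_neighbors(ratings: Ratings, u: str, k: int) -> List[str]:
    """Return the k users closest to u (k would be 10, 30, or 80 in the text)."""
    scored = []
    for v in ratings:
        if v == u:
            continue
        d = multi_dim_distance(ratings, u, v)
        if d is not None:
            scored.append((d, v))
    scored.sort()
    return [v for _, v in scored[:k]]


# Toy data with two aspects per rating, invented for illustration.
toy = {
    "alice": {"i1": [4.0, 5.0], "i2": [3.0, 3.0]},
    "bob": {"i1": [4.0, 4.0], "i2": [3.0, 3.0]},
    "carol": {"i1": [1.0, 1.0]},
}
print(nearest_neighbors(toy, "alice", k=1))
```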

The effectiveness of their algorithm was measured by calculating the average Mean Absolute Error (MAE). The RiVal framework was used to compute the metrics, to ensure the reliability of the results [5].
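For reference, MAE is the mean of the absolute differences between predicted and actual ratings. The sketch below computes it in plain Python and averages it over several evaluation splits; averaging over splits is an assumption, since the text does not say what the average is taken over, and the sample numbers are invented.

```python
from typing import List, Sequence, Tuple


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean of |prediction - true rating| over all test ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def average_mae(splits: List[Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Average the MAE over evaluation splits (assumed averaging scheme)."""
    return sum(mean_absolute_error(p, a) for p, a in splits) / len(splits)


# Invented (predicted, actual) pairs for two splits.
splits = [
    ([4.0, 3.0], [5.0, 3.0]),
    ([2.0, 4.0], [2.0, 2.0]),
]
print(average_mae(splits))
```

Lower values are better, which is how the MAE figures in the result tables should be read.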

3.4.1.1 Discussion and Result

In this evaluation, they analyzed different configurations, all based on aspect-based sentiment analysis. The results are shown in Tables 3.2 and 3.3. For space reasons, they reported only the results obtained with the AFINN sentiment analysis algorithm, which did not differ in any major way from those of the CoreNLP algorithm. For the user-based CF approach on the Yelp and TripAdvisor datasets, they took the top 10 aspects, and these results were better than those obtained with 50 aspects. Accordingly, they did not consider a bigger space.

Another interesting result on Yelp and TripAdvisor is that the use of sub-aspects gave a significant improvement in performance. Here, the maximum efficiency came from using the top 50 aspects. Further investigation is needed to understand this better [5, 20].

Table 3.2 Result comparison.

Result of MCRS-Based CF Experiment 1

Result of Experiment 2


Configuration	Dataset			Dataset
#neigh.	#asp.	Sub-asp	Yelp	TripAdvisor	Amazon	Configuration	Yelp	TripAdvisor	Amazon
10	10	Y	0.8362	0.7111	0.6464	Multi-U2U	0.8362	0.7111	0.6276
10	10